Artificial intelligence (AI) has witnessed major breakthroughs in a variety of Internet of Things (IoT) applications and services, spanning from recommendation systems to robotics control and military surveillance. This is driven by easier access to sensory data and the enormous scale of pervasive/ubiquitous devices that generate zettabytes (ZB) of real-time data streams. Designing accurate models using such data streams, to predict future insights and revolutionize the decision-making process, inaugurates pervasive systems as a worthy paradigm for a better quality of life. The confluence of pervasive computing and artificial intelligence, Pervasive AI, expanded the role of ubiquitous IoT systems from mainly data collection to executing distributed computations, a promising alternative to centralized learning that presents various challenges. In this context, wise cooperation and resource scheduling should be envisioned among IoT devices (e.g., smartphones, smart vehicles) and infrastructure (e.g., edge nodes and base stations) to avoid communication and computation overheads and ensure maximum performance. In this paper, we conduct a comprehensive survey of the recent techniques developed to overcome these resource challenges in pervasive AI systems. Specifically, we first present an overview of pervasive computing, its architecture, and its intersection with artificial intelligence. We then review the background, applications, and performance metrics of AI, particularly deep learning (DL) and online learning, running in ubiquitous systems. Next, we provide a deep literature review of communication-efficient techniques, from both algorithmic and system perspectives, for distributed inference, training, and online learning tasks across the combination of IoT devices, edge devices, and cloud servers. Finally, we discuss our future vision and research challenges.
The following article presents a memetic algorithm applying deep reinforcement learning (DRL) for solving practically oriented dual resource constrained flexible job shop scheduling problems (DRC-FJSSP). In recent years, there has been extensive research on DRL techniques, but without considering realistic, flexible, and human-centered shopfloors. A research gap can be identified in the context of make-to-order oriented discontinuous manufacturing, as it is often represented in medium-sized companies with high service levels. From practical industry projects in this domain, we recognize requirements to depict flexible machines, human workers and capabilities, setup and processing operations, material arrival times, complex job paths with parallel tasks for bill of material (BOM) manufacturing, sequence-dependent setup times, and (partially) automated tasks. On the other hand, intensive research has been done on metaheuristics in the context of DRC-FJSSP. However, there is a lack of suitable and generic scheduling methods that can be holistically applied in sociotechnical production and assembly processes. In this paper, we first formulate an extended DRC-FJSSP induced by the practical requirements mentioned above. Then we present our proposed hybrid framework with parallel computing for multicriteria optimization. Through numerical experiments with real-world data, we confirm that the framework generates feasible schedules efficiently and reliably. Utilizing DRL instead of random operations leads to better results and outperforms traditional approaches.
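The extended DRC-FJSSP couples each operation to machine and worker eligibility, material arrival, parallel BOM predecessors, and sequence-dependent setups. A minimal sketch of a data model for such an instance, assuming illustrative field names not taken from the paper:

```python
from dataclasses import dataclass, field

@dataclass
class Operation:
    """One task of a job in an extended DRC-FJSSP (illustrative model)."""
    job: str
    eligible_machines: dict          # machine id -> processing time
    eligible_workers: set            # workers with the required capability
    predecessors: list = field(default_factory=list)  # parallel BOM paths
    material_ready: float = 0.0      # earliest start (material arrival time)
    automated: bool = False          # runs without a worker once set up

# Sequence-dependent setup: time to switch machine "m1" from opA to opB.
setup_time = {("m1", "opA", "opB"): 5.0, ("m1", "opB", "opA"): 8.0}

op = Operation(job="J1", eligible_machines={"m1": 12.0, "m2": 15.5},
               eligible_workers={"w3", "w7"}, material_ready=30.0)
```

A scheduler built on such a structure can then treat machine choice, worker assignment, and sequencing as separate decisions, which is what makes the problem dual resource constrained.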
Hyperparameter optimization (HPO) is essential for the better performance of deep learning, and practitioners often need to consider the trade-off between multiple metrics, such as error rate, latency, memory requirements, robustness, and algorithmic fairness. Due to this demand and the heavy computation of deep learning, the acceleration of multi-objective (MO) optimization becomes ever more important. Although meta-learning has been extensively studied to speed up HPO, existing methods are not applicable to the MO tree-structured parzen estimator (MO-TPE), a simple yet powerful MO-HPO algorithm. In this paper, we extend TPE's acquisition function to the meta-learning setting, using a task similarity defined by the overlap in promising domains of each task. In a comprehensive set of experiments, we demonstrate that our method accelerates MO-TPE on tabular HPO benchmarks and yields state-of-the-art performance. Our method was also validated externally by winning the AutoML 2022 competition on "Multiobjective Hyperparameter Optimization for Transformers".
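For orientation, TPE ranks candidates by the density ratio of "promising" to "non-promising" observations, and the task similarity above compares the promising domains of two tasks. A minimal sketch of both ideas, assuming 1-D hyperparameters, Gaussian KDEs, and helper names that are illustrative rather than from the paper:

```python
import numpy as np
from scipy.stats import gaussian_kde

def tpe_acquisition(candidates, good_obs, bad_obs):
    """Rank candidates by the TPE density ratio l(x) / g(x), where
    l models promising observations and g the remaining ones."""
    l = gaussian_kde(good_obs)   # density of promising configs
    g = gaussian_kde(bad_obs)    # density of the rest
    return l(candidates) / np.maximum(g(candidates), 1e-12)

def promising_overlap(good_a, good_b, grid):
    """Illustrative task similarity: overlap of the promising densities
    of two tasks, evaluated on a shared grid (close to 1 = similar)."""
    da, db = gaussian_kde(good_a)(grid), gaussian_kde(good_b)(grid)
    return np.trapz(np.minimum(da, db), grid)

rng = np.random.default_rng(0)
good = rng.normal(0.2, 0.05, 30)   # hyperparameter values with good scores
bad = rng.normal(0.6, 0.20, 70)
print(tpe_acquisition(rng.uniform(0, 1, 5), good, bad))
grid = np.linspace(0, 1, 200)
print(promising_overlap(good, rng.normal(0.25, 0.05, 30), grid))
```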
Accurate travel time estimation is paramount for providing transit users with reliable schedules and dependable real-time information. This paper is the first to utilize roadside urban imagery for direct transit travel time prediction. We propose and evaluate an end-to-end framework integrating traditional transit data sources with a roadside camera for automated roadside image data acquisition, labeling, and model training to predict transit travel times across a segment of interest. First, we show how GTFS real-time data can be utilized as an efficient activation mechanism for a roadside camera unit monitoring a segment of interest. Second, AVL data is utilized to generate ground-truth labels for the acquired images based on the observed transit travel time percentiles across the camera-monitored segment at the time of image acquisition. Finally, the generated labeled image dataset is used to train and thoroughly evaluate a Vision Transformer (ViT) model to predict a discrete transit travel time range (band). The results illustrate that the ViT model is able to learn image features and contents that best help it deduce the expected travel time range, with an average validation accuracy ranging between 80% and 85%. We assess the interpretability of the ViT model's predictions and showcase how this discrete travel time band prediction can subsequently improve continuous transit travel time estimation. The workflow and results presented in this study provide an end-to-end, scalable, automated, and highly efficient approach for integrating traditional transit data sources and roadside imagery to improve the estimation of transit travel times. This work also demonstrates the value of incorporating real-time information from computer-vision sources, which are becoming increasingly accessible and can have major implications for improving operations and passenger real-time information.
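The percentile-based labeling step described above maps an observed segment travel time to a discrete band. A minimal sketch, where the percentile cut-offs and function names are assumptions, since the abstract does not specify the exact band boundaries:

```python
import numpy as np

def travel_time_band(observed_s, historical_s, pct=(25, 50, 75)):
    """Label an observed segment travel time (seconds) with a discrete
    band index, using percentiles of historical AVL travel times over
    the camera-monitored segment."""
    cuts = np.percentile(historical_s, pct)
    return int(np.searchsorted(cuts, observed_s))  # band 0..len(pct)

# Example: history centred around 120 s; a 95 s run falls in the fastest band.
history = np.random.default_rng(1).normal(120, 15, 1000)
print(travel_time_band(95, history))   # 0
print(travel_time_band(140, history))  # 3 (slowest band)
```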
Batch Normalization (BN) is an important preprocessing step for many deep learning applications. Since it is a data-dependent process, for some homogeneous datasets it is a redundant or even performance-degrading step. In this paper, we propose an early-stage feasibility assessment method for estimating the benefits of applying BN to given data batches. The proposed method uses a novel threshold-based approach to classify the training data batches into two sets according to their need for normalization, where that need is decided based on the feature heterogeneity of the considered batch. The proposed approach is a pre-training step and therefore adds no training overhead. The evaluation results show that the proposed approach outperforms traditional BN, mostly at small batch sizes, on the MNIST, Fashion-MNIST, CIFAR-10, and CIFAR-100 datasets. Additionally, network stability is increased by reducing the occurrence of internal variable transformations.
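One way to realize such a threshold rule: score each batch's feature heterogeneity and normalize only the batches that exceed a threshold. In the sketch below, the mean per-feature standard deviation is an assumed proxy statistic, not necessarily the paper's exact measure:

```python
import numpy as np

def needs_batch_norm(batch, threshold=0.5):
    """Decide, before training, whether a batch is heterogeneous enough
    to benefit from Batch Normalization.

    batch: array of shape (batch_size, num_features). The heterogeneity
    statistic (mean per-feature std) is an illustrative stand-in."""
    heterogeneity = batch.std(axis=0).mean()
    return heterogeneity > threshold

rng = np.random.default_rng(0)
homogeneous = rng.normal(0.0, 0.1, (64, 32))
heterogeneous = rng.normal(0.0, 2.0, (64, 32))
print(needs_batch_norm(homogeneous), needs_batch_norm(heterogeneous))
# False True
```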
Audio is one of the most commonly used modes of human communication, but at the same time, it is easily abused to deceive people. With the revolution of AI, the related technologies are now accessible to almost everyone, making it simple for criminals to commit crimes and forgeries. In this work, we introduce a deep learning method to develop a classifier that blindly classifies an input audio as real or mimicked. The proposed model is trained on a set of important features extracted from a large audio dataset to obtain a classifier, which is then tested on the same features of different audios. Two datasets were created for this work: an all-English dataset and a mixed dataset (Arabic and English). These datasets have been made available to the research community through GitHub at https://github.com/sass7/dataset. For comparison, the audios were also classified by human inspection, with the subjects being native speakers. The ensuing results are interesting and exhibit strong accuracy.
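The pipeline described, feature extraction from audio clips followed by a binary real-vs-mimicked classifier, could look like the sketch below. The MFCC features and the scikit-learn classifier are assumptions, since the abstract does not name the exact features or model:

```python
import numpy as np
import librosa
from sklearn.ensemble import RandomForestClassifier

def extract_features(path, sr=16000, n_mfcc=20):
    """Summarize a clip by the mean and std of its MFCCs
    (an assumed feature set; the paper's features may differ)."""
    y, _ = librosa.load(path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

# X: stacked feature vectors, y: 1 = real, 0 = imitated
# clf = RandomForestClassifier(n_estimators=300).fit(X, y)
# clf.predict(extract_features("clip.wav")[None, :])
```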
We propose the first system for real-time semantic segmentation via deep learning on a weak micro-computer, such as the Raspberry Pi Zero v2 (whose price is \$15), attached to a toy drone. In particular, since the Raspberry Pi weighs less than $16$ grams and its size is half that of a credit card, we can easily attach it to the common commercial DJI Tello toy drone (<\$100, <90 grams, 98 $\times$ 92.5 $\times$ 41 mm). The result is an autonomous drone (no laptop or human in the loop) that can detect and classify objects in real time from its on-board monocular RGB camera (no GPS or LIDAR sensors). The companion videos demonstrate how this Tello drone scans the lab for people (e.g., for use by firefighters or security forces) and for empty parking spots outside the lab. Existing deep learning solutions are either too slow for real-time computation on such an IoT device or provide results of impractical quality. Our main challenge is to design a system that picks the best option among the numerous combinations of networks, deep learning platforms/frameworks, compression techniques, and compression ratios. To this end, we provide an efficient searching algorithm that aims to find the optimal combination, yielding the best trade-off between the network's running time and its accuracy/performance.
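The search just described optimizes an accuracy/latency trade-off over a discrete space of networks × frameworks × compression techniques × compression ratios. A minimal exhaustive-search sketch of that objective, where the scoring rule and the `evaluate` stub are assumptions (the paper's algorithm is more efficient than brute force):

```python
from itertools import product

def evaluate(network, framework, technique, ratio):
    """Stub: deploy the compressed network on the target device and
    measure (accuracy, runtime_ms). Replace with real benchmarking."""
    raise NotImplementedError

def best_combination(networks, frameworks, techniques, ratios,
                     max_runtime_ms=50.0):
    """Pick the most accurate combination that meets the real-time
    budget. This exhaustive loop only illustrates the objective the
    paper's efficient search optimizes."""
    best, best_acc = None, -1.0
    for combo in product(networks, frameworks, techniques, ratios):
        acc, runtime = evaluate(*combo)
        if runtime <= max_runtime_ms and acc > best_acc:
            best, best_acc = combo, acc
    return best
```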
Pruning is one of the predominant approaches for compressing deep neural networks (DNNs). Recently, coresets (provable data summarizations) were used for pruning DNNs, adding the advantage of theoretical guarantees on the trade-off between the compression rate and the approximation error. However, coresets in this domain were either data-dependent or generated under restrictive assumptions on both the model's weights and inputs. In real-world scenarios, such assumptions are rarely satisfied, limiting the applicability of coresets. To this end, we suggest a novel and robust framework for computing such coresets under mild assumptions on the model's weights, without any assumption on the training data. The idea is to compute the importance of each neuron in each layer with respect to the output of the following layer. This is achieved by a combination of the L\"{o}wner ellipsoid and the Caratheodory theorem. Our method is simultaneously data-independent, applicable to various networks and datasets (due to the simplified assumptions), and theoretically supported. It outperforms existing coreset-based neural pruning approaches across a wide range of networks and datasets. For example, our method achieved a $62\%$ compression rate on ResNet50 on ImageNet with only a $1.09\%$ drop in accuracy.
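At a high level, the framework scores each neuron by its importance to the following layer's output and keeps the most important ones. The sketch below prunes a fully connected layer with a simple importance proxy (the norm of a neuron's outgoing weights); the paper's scores come from the L\"{o}wner ellipsoid/Caratheodory construction, which this stand-in does not reproduce:

```python
import numpy as np

def prune_neurons(W_out, keep_ratio=0.38):
    """Keep the neurons whose outgoing weights matter most to the
    following layer.

    W_out: (next_dim, this_dim) weight matrix of the *following* layer;
    column j holds neuron j's outgoing weights. The l2-norm importance
    used here is an illustrative proxy for the paper's coreset scores."""
    importance = np.linalg.norm(W_out, axis=0)
    k = max(1, int(keep_ratio * W_out.shape[1]))
    keep = np.argsort(importance)[-k:]
    return np.sort(keep)  # indices of neurons to retain

rng = np.random.default_rng(0)
W = rng.normal(size=(128, 256))
print(prune_neurons(W).shape)  # ~38% of the 256 neurons kept
```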
Prosody plays a vital role in verbal communication. The acoustic expression of prosody has been widely studied. However, prosodic characteristics are perceived not only auditorily but also visually, based on head and facial movements. The aim of this report is to present a method for examining audiovisual prosody using virtual reality. We show that animations based on a virtual human provide motion cues similar to video recordings of a real talker. The use of virtual reality opens up new avenues for examining the multimodal effects of verbal communication. We discuss the method in the framework of a study on the perception of prosody in cochlear implant listeners.
Due to the rapid growth of electrical capacitance tomography (ECT) applications in several industrial fields, there is a need to develop high-quality, yet fast, image reconstruction methods from raw capacitance measurements. Deep learning, as an effective nonlinear mapping tool for complicated functions, has become popular in many fields, including electrical tomography. In this paper, we propose a conditional generative adversarial network (CGAN) model for reconstructing ECT images from capacitance measurements. The initial image of the CGAN model is constructed from the capacitance measurements. To the best of our knowledge, this is the first time that capacitance measurements are represented in an image form. We created a new massive ECT dataset of 320K synthetic image-measurement pairs for training and testing the proposed model. The feasibility and generalization ability of the proposed CGAN-ECT model are evaluated using the testing dataset, contaminated data, and flow patterns that were not exposed to the model during the training phase. The evaluation results prove that the proposed CGAN-ECT model can efficiently create more accurate ECT images than traditional and other learning-based image reconstruction algorithms. CGAN-ECT achieved an average image correlation coefficient of more than 99.3% and an average relative image error of about 0.07.
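The key representational step above is turning a raw capacitance vector into an image that conditions the generator. A minimal PyTorch sketch of that step plus a toy conditional generator interface; the tiling scheme, layer sizes, and the 66-measurement example (the mutual capacitances of a 12-electrode sensor) are assumptions, not the paper's exact construction:

```python
import torch
import torch.nn as nn

def measurements_to_image(c, size=64):
    """Tile a raw capacitance vector into a coarse 2-D map and upsample
    it to the reconstruction grid, so the generator can be conditioned
    on an image (an assumed stand-in for the paper's initial image)."""
    n = c.numel()
    side = int(n ** 0.5) + 1
    grid = torch.zeros(side * side)
    grid[:n] = c
    grid = grid.view(1, 1, side, side)
    return nn.functional.interpolate(grid, size=(size, size),
                                     mode="bilinear", align_corners=False)

# Illustrative conditional generator: initial image in, ECT image out.
generator = nn.Sequential(
    nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 1, 3, padding=1), nn.Sigmoid(),
)

c = torch.rand(66)  # e.g., 66 mutual capacitances of a 12-electrode sensor
img = generator(measurements_to_image(c))
print(img.shape)  # torch.Size([1, 1, 64, 64])
```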